
Collaborating Authors

Blei and Lafferty


Historia Magistra Vitae: Dynamic Topic Modeling of Roman Literature using Neural Embeddings

Ginn, Michael, Hulden, Mans

arXiv.org Artificial Intelligence

Dynamic topic models have been proposed as a tool for historical analysis, but traditional approaches have had limited usefulness, being difficult to configure, interpret, and evaluate. In this work, we experiment with a recent approach for dynamic topic modeling using BERT embeddings. We compare topic models built using traditional statistical models (LDA and NMF) and the BERT-based model, modeling topics over the entire surviving corpus of Roman literature. We find that while quantitative metrics prefer the statistical models, qualitative evaluation shows that the neural model yields better insights. Furthermore, the neural topic model is less sensitive to hyperparameter configuration and thus may make dynamic topic modeling more viable for historical researchers.
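
As a concrete anchor for the statistical baselines mentioned above, the sketch below fits LDA per time slice with scikit-learn; the toy corpus, era labels, and all hyperparameters are illustrative placeholders, not the paper's setup or data.

```python
# Hedged sketch: per-era LDA as a statistical baseline for dynamic topic
# modeling. The corpus and era labels below are toy placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    ("arma virumque cano troiae qui primus ab oris", "republic"),
    ("gallia est omnis divisa in partes tres", "republic"),
    ("odi et amo quare id faciam fortasse requiris", "republic"),
    ("senatus populusque romanus imperium tenet", "empire"),
    ("carmina non prius audita musarum sacerdos canto", "empire"),
    ("imperator caesar divi filius augustus pater patriae", "empire"),
]

for era in ("republic", "empire"):
    docs = [text for text, e in corpus if e == era]
    cv = CountVectorizer()
    X = cv.fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
    vocab = cv.get_feature_names_out()
    for topic in lda.components_:
        print(era, [vocab[i] for i in topic.argsort()[-3:][::-1]])
```

A BERT-based alternative would instead embed documents and cluster the embeddings per slice, which is what makes it less dependent on the hyperparameter choices above.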


Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation

Neural Information Processing Systems

We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high-dimensional undirected graphs. In particular, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view for analyzing the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory.
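
The estimator itself is not reproduced here, but the nonparanormal-plus-neighborhood-regression pipeline it refines can be sketched in a few lines: Gaussianize each marginal with a rank-based transform, then recover each node's neighborhood with an l1-penalized regression. The data, penalty, and threshold below are arbitrary choices for illustration.

```python
# Hedged sketch: rank-based nonparanormal transform followed by lasso
# neighborhood regression. Data, penalty, and threshold are illustrative.
import numpy as np
from scipy.stats import rankdata, norm
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
X[:, 1] += 0.8 * X[:, 0]   # plant one conditional dependency
X = np.exp(X)              # monotone marginal distortion (nonparanormal data)

# Nonparanormal transform: map each marginal to Gaussian scores via ranks.
Z = norm.ppf(rankdata(X, axis=0) / (n + 1))

# Neighborhood pursuit: l1-regularized regression of each node on the rest.
edges = set()
for j in range(d):
    others = [k for k in range(d) if k != j]
    coef = Lasso(alpha=0.1).fit(Z[:, others], Z[:, j]).coef_
    edges.update(tuple(sorted((j, k))) for k, c in zip(others, coef) if abs(c) > 1e-6)
print(edges)   # expected to contain (0, 1)
```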


The Dynamic Embedded Topic Model

Dieng, Adji B., Ruiz, Francisco J. R., Blei, David M.

arXiv.org Machine Learning

Topic modeling analyzes documents to learn meaningful patterns of words. Dynamic topic models capture how these patterns vary over time for a set of documents that were collected over a large time span. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution whose parameter is given by the inner product between the word embedding and an embedding representation of its assigned topic at a particular time step. The word embeddings allow the D-ETM to generalize to rare words. The D-ETM learns smooth topic trajectories by defining a random walk prior over the embeddings of the topics. We fit the D-ETM using structured amortized variational inference. On a collection of United Nations debates, we find that the D-ETM learns interpretable topics and outperforms D-LDA in terms of both topic quality and predictive performance.
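
The generative core described here is compact enough to write out directly; a minimal numpy sketch with toy dimensions and untrained parameters is:

```python
# Minimal numpy sketch of the D-ETM's generative core with toy dimensions:
# beta[t, k] = softmax(rho @ alpha[t, k]) and a Gaussian random walk on alpha.
import numpy as np

rng = np.random.default_rng(0)
V, L, K, T = 1000, 50, 4, 10       # vocab, embedding dim, topics, time steps
rho = rng.standard_normal((V, L))  # word embeddings (pretrained or learned)

sigma = 0.05                       # random-walk prior: alpha_t = alpha_{t-1} + noise
alpha = np.zeros((T, K, L))
alpha[0] = rng.standard_normal((K, L))
for t in range(1, T):
    alpha[t] = alpha[t - 1] + sigma * rng.standard_normal((K, L))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

beta = softmax(alpha @ rho.T)       # (T, K, V): per-time topic-word distributions
word = rng.choice(V, p=beta[3, 0])  # draw one word from topic 0 at time step 3
```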


Scalable Generalized Dynamic Topic Models

Jähnichen, Patrick, Wenzel, Florian, Kloft, Marius, Mandt, Stephan

arXiv.org Machine Learning

Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular topic models, and also limit scalability. In this paper, we present several new results on DTMs. First, we extend the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs). This allows us to explore topics that develop smoothly over time, that have long-term memory, or that are temporally concentrated (for event detection). Second, we show how to perform scalable approximate inference in these models based on ideas from stochastic variational inference and sparse Gaussian processes. This way, we can train a rich family of DTMs on massive data. Our experiments on several large-scale datasets show that our generalized model allows us to find interesting patterns that were not accessible by previous approaches.
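
The widened class of priors is easy to make concrete: sampling a one-dimensional parameter trajectory under a Wiener kernel recovers classic DTM behavior, while RBF and Ornstein-Uhlenbeck kernels give the smooth and temporally concentrated regimes the abstract mentions. The kernel parameters below are arbitrary.

```python
# Illustrative sketch: one-dimensional topic-parameter trajectories under
# different GP priors. Wiener recovers the classic DTM; RBF is smooth with
# long-range correlation; Ornstein-Uhlenbeck is mean-reverting.
import numpy as np

t = np.linspace(0.0, 10.0, 100)
T1, T2 = np.meshgrid(t, t)
kernels = {
    "wiener": np.minimum(T1, T2),
    "rbf": np.exp(-0.5 * (T1 - T2) ** 2),
    "ou": np.exp(-np.abs(T1 - T2)),
}

rng = np.random.default_rng(0)
jitter = 1e-8 * np.eye(len(t))  # numerical stabilizer for the Cholesky/SVD
paths = {name: rng.multivariate_normal(np.zeros(len(t)), K + jitter)
         for name, K in kernels.items()}   # one sampled trajectory per prior
```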


Opinion mining from twitter data using evolutionary multinomial mixture models

Hasnat, Md. Abul, Velcin, Julien, Bonnevay, Stéphane, Jacques, Julien

arXiv.org Machine Learning

The image of an entity can be defined as a structured and dynamic representation extracted from the opinions of a group of users or a population. Automatic extraction of such an image is of particular importance in political science and sociology, e.g., when an extended inquiry over large-scale data is required. We study the images of two politically significant entities of France, constructed by analyzing opinions collected from the well-known social media platform Twitter. Our goal is to build a system that automatically extracts the image of entities over time. In this paper, we propose a novel evolutionary clustering method based on parametric links among Multinomial mixture models. First, we formulate a generalized model that establishes parametric links among the Multinomial distributions. Afterward, we follow a model-based clustering approach to explore different parametric sub-models and select the best one. For the experiments, we first use synthetic temporal data and then apply the method to annotated social media data. Results show that the proposed method outperforms the state-of-the-art on common evaluation metrics. Additionally, it can provide an interpretation of the temporal evolution of the clusters.
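
As a hedged skeleton of the temporal-linkage idea (the paper's parametric links are richer than this), the sketch below runs EM for a plain multinomial mixture and warm-starts each epoch from the previous epoch's parameters:

```python
# Hedged sketch: EM for a multinomial mixture, with each epoch warm-started
# from the previous epoch's topic-word parameters as a crude temporal link.
import numpy as np

def multinomial_mixture_em(X, K, theta=None, n_iter=50, seed=0):
    """X: (n_docs, vocab_size) count matrix; returns mixing weights, params."""
    rng = np.random.default_rng(seed)
    n, V = X.shape
    pi = np.full(K, 1.0 / K)
    if theta is None:
        theta = rng.dirichlet(np.ones(V), size=K)
    for _ in range(n_iter):
        # E-step: responsibilities (the multinomial coefficient cancels out).
        log_r = np.log(pi) + X @ np.log(theta).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights and smoothed topic-word probabilities.
        pi = r.mean(axis=0)
        theta = r.T @ X + 1e-3
        theta /= theta.sum(axis=1, keepdims=True)
    return pi, theta

rng = np.random.default_rng(1)
X0 = rng.multinomial(30, [0.6, 0.2, 0.1, 0.1], size=40)  # epoch t
X1 = rng.multinomial(30, [0.5, 0.3, 0.1, 0.1], size=40)  # epoch t+1
pi0, theta0 = multinomial_mixture_em(X0, K=2)
pi1, theta1 = multinomial_mixture_em(X1, K=2, theta=theta0)  # linked in time
```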


Regularizing Flat Latent Variables with Hierarchical Structures

Lin, Rongcheng (University of North Carolina at Charlotte) | Li, Huayu (University of North Carolina at Charlotte) | Quan, Xiaojun (Institute for Infocomm Research) | Hong, Richang (Hefei University of Technology) | Wu, Zhiang (Nanjing University of Finance and Economics) | Ge, Yong (University of North Carolina at Charlotte)

AAAI Conferences

In this paper, we propose a stratified topic model (STM). Instead of directly modeling and inferring flat topics or hierarchically structured topics, we use the stratified relationships in topic hierarchies to regularize the flat topics. The topic structures are captured by a hierarchical clustering method and act as constraints during the learning process. We propose two theoretically sound and practical inference methods to solve the model. Experimental results on two real-world data sets, under various evaluation metrics, demonstrate the effectiveness of the proposed model.
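
One ingredient is easy to illustrate: obtaining a topic hierarchy by hierarchically clustering flat topic-word distributions. The sketch below does only that step, with random stand-in topics; the regularized learning itself is not shown.

```python
# Hedged sketch: derive a topic hierarchy by agglomeratively clustering flat
# topic-word distributions (random stand-ins here). In STM this structure
# then constrains the flat topics during learning, which is not shown.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
topics = rng.dirichlet(np.ones(100), size=8)  # 8 flat topics, 100-word vocab

Z = linkage(topics, method="average", metric="cosine")  # topic dendrogram
groups = fcluster(Z, t=3, criterion="maxclust")         # stratify into 3 groups
print(groups)  # group label per flat topic
```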


Dependent Multinomial Models Made Easy: Stick Breaking with the Pólya-Gamma Augmentation

Linderman, Scott W., Johnson, Matthew J., Adams, Ryan P.

arXiv.org Machine Learning

Many practical modeling problems involve discrete data that are best represented as draws from multinomial or categorical distributions. For example, nucleotides in a DNA sequence, children's names in a given state and year, and text documents are all commonly modeled with multinomial distributions. In all of these cases, we expect some form of dependency between the draws: the nucleotide at one position in the DNA strand may depend on the preceding nucleotides, children's names are highly correlated from year to year, and topics in text may be correlated and dynamic. These dependencies are not naturally captured by the typical Dirichlet-multinomial formulation. Here, we leverage a logistic stick-breaking representation and recent innovations in Pólya-gamma augmentation to reformulate the multinomial distribution in terms of latent variables with jointly Gaussian likelihoods, enabling us to take advantage of a host of Bayesian inference techniques for Gaussian models with minimal overhead.
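
The logistic stick-breaking map at the heart of this construction is short enough to sketch: K-1 real-valued (jointly Gaussian) latents are pushed through sigmoids to carve the unit interval into K probabilities. The Pólya-gamma augmentation step itself is omitted here.

```python
# Sketch of the logistic stick-breaking map: K-1 jointly Gaussian latents
# become a K-dimensional multinomial parameter. The Pólya-gamma augmentation
# that makes the resulting likelihood conditionally Gaussian is not shown.
import numpy as np

def stick_breaking(psi):
    """Map a length K-1 real vector onto the K-simplex."""
    sig = 1.0 / (1.0 + np.exp(-psi))                       # break fractions
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - sig)))
    return np.append(sig, 1.0) * remaining                 # stick lengths

psi = np.array([0.5, -1.0, 2.0])  # e.g., one draw from a correlated Gaussian
pi = stick_breaking(psi)          # length-4 probability vector
assert np.isclose(pi.sum(), 1.0)
```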


Probable convexity and its application to Correlated Topic Models

Than, Khoat, Ho, Tu Bao

arXiv.org Machine Learning

Non-convex optimization problems often arise from probabilistic modeling, such as estimation of posterior distributions. Non-convexity makes the problems intractable and poses various obstacles to designing efficient algorithms. In this work, we attack non-convexity by first introducing the concept of probable convexity for analyzing the convexity of real functions in practice. We then use the new concept to analyze an inference problem in the Correlated Topic Model (CTM) and related nonconjugate models. Contrary to the existing belief of intractability, we show that this inference problem is concave under certain conditions. One consequence of our analysis is a novel algorithm for learning CTM which is significantly more scalable and of higher quality than existing methods. Finally, we highlight that stochastic gradient algorithms might be a practical choice for efficiently solving non-convex problems. This finding might prove beneficial in many contexts beyond probabilistic modeling.
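
The concept can be illustrated empirically: rather than proving convexity everywhere, estimate the probability that the Hessian is positive semidefinite over a distribution of points. The function and sampling region below are toy choices, not the CTM inference objective.

```python
# Illustrative sketch of "probable convexity": estimate the probability that
# the Hessian is positive semidefinite over sampled points in the domain.
import numpy as np

def hessian(f, x, eps=1e-3):
    """Finite-difference Hessian of f at x."""
    d = len(x)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei, ej = np.eye(d)[i] * eps, np.eye(d)[j] * eps
            H[i, j] = (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / eps**2
    return H

f = lambda x: np.sum(x**4) - 0.5 * np.sum(x**2)   # nonconvex near the origin
rng = np.random.default_rng(0)
points = rng.uniform(-2, 2, size=(500, 3))
psd = [np.all(np.linalg.eigvalsh(hessian(f, x)) >= -1e-4) for x in points]
print("estimated probability of convexity:", np.mean(psd))
```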


Managing sparsity, time, and quality of inference in topic models

Than, Khoat, Ho, Tu Bao

arXiv.org Artificial Intelligence

Inference is an integral part of probabilistic topic models, but it is often nontrivial to derive an efficient algorithm for a specific model. It is even more challenging when we want a fast inference algorithm which always yields sparse latent representations of documents. In this article, we introduce a simple framework for inference in probabilistic topic models, denoted by FW. This framework is general and flexible enough to be easily adapted to mixture models. It has a linear convergence rate, offers an easy way to incorporate prior knowledge, and lets us directly trade off sparsity against quality and time. We demonstrate the goodness and flexibility of FW over existing inference methods on a number of tasks. Finally, we show how inference in topic models with nonconjugate priors can be done efficiently.
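
The greedy sparse approximation at the core of such a framework can be illustrated with a Frank-Wolfe-style pass over the topic simplex, which adds at most one topic to the support per iteration and has the linear-convergence flavor the abstract claims; the sketch below is that generic idea, not necessarily the paper's exact algorithm.

```python
# Hedged sketch: Frank-Wolfe over the topic simplex for a single document.
# Each iteration moves toward one vertex (topic), so the support stays small.
import numpy as np

rng = np.random.default_rng(0)
K, V = 25, 200
beta = rng.dirichlet(np.ones(V), size=K)                     # topic-word dists
counts = rng.multinomial(80, 0.7 * beta[2] + 0.3 * beta[5])  # a toy document

def grad_loglik(theta):
    p = theta @ beta                  # document's mixture distribution over words
    return beta @ (counts / p)        # gradient of sum_w counts_w * log p_w

theta = np.zeros(K)
theta[0] = 1.0                        # start at an arbitrary simplex vertex
for t in range(1, 11):
    s = np.argmax(grad_loglik(theta)) # linear subproblem: best single vertex
    gamma = 2.0 / (t + 2)             # standard Frank-Wolfe step size
    theta = (1 - gamma) * theta + gamma * np.eye(K)[s]
print("support size:", np.count_nonzero(theta > 1e-8), "of", K)
```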


Variational Inference in Nonconjugate Models

Wang, Chong, Blei, David M.

arXiv.org Machine Learning

Mean-field variational methods are widely used for approximate posterior inference in many probabilistic models. In a typical application, mean-field methods approximately compute the posterior with a coordinate-ascent optimization algorithm. When the model is conditionally conjugate, the coordinate updates are easily derived and in closed form. However, many models of interest, like the correlated topic model and Bayesian logistic regression, are nonconjugate. In these models, mean-field methods cannot be directly applied and practitioners have had to develop variational algorithms on a case-by-case basis. In this paper, we develop two generic methods for nonconjugate models: Laplace variational inference and delta method variational inference. Our methods have several advantages: they allow easily derived variational algorithms for a wide class of nonconjugate models; they extend and unify some of the existing algorithms that have been derived for specific models; and they work well on real-world datasets. We studied our methods on the correlated topic model, Bayesian logistic regression, and hierarchical Bayesian logistic regression.
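
The Laplace step underlying the first method is standard and small enough to sketch on a toy Bayesian logistic regression: find the posterior mode numerically, then use the inverse negative Hessian of the log joint at the mode as a Gaussian covariance. Everything below is a generic illustration, not the paper's algorithm for the correlated topic model.

```python
# Hedged sketch of a Laplace approximation on toy Bayesian logistic
# regression: Gaussian centered at the posterior mode with covariance equal
# to the inverse negative Hessian of the log joint at that mode.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

def neg_log_joint(w):
    logits = X @ w
    nll = np.logaddexp(0, logits).sum() - y @ logits  # logistic likelihood
    return nll + 0.5 * w @ w                          # standard normal prior

w_map = minimize(neg_log_joint, np.zeros(d)).x        # posterior mode
p = 1 / (1 + np.exp(-X @ w_map))
H = X.T @ (X * (p * (1 - p))[:, None]) + np.eye(d)    # Hessian at the mode
cov = np.linalg.inv(H)                                # Laplace covariance
print(w_map, np.sqrt(np.diag(cov)))
```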